Search results for "reinforcement learning"
showing 10 items of 95 documents
A Reinforcement Learning Approach for User Preference-aware Energy Sharing Systems
2021
Energy Sharing Systems (ESS) are envisioned to be the future of power systems. In these systems, consumers equipped with renewable energy generation capabilities are able to participate in an energy market to sell their energy. This paper proposes an ESS that, unlike previous works, takes into account the consumers’ preference, engagement, and bounded rationality. The problem of maximizing the energy exchange while considering such user modeling is formulated and shown to be NP-Hard. To learn the user behavior, two heuristics are proposed: 1) a Reinforcement Learning-based algorithm, which provides a bounded regret and 2) a more computationally efficient heuristic, named BPT- ${K}…
Virtual Resource Allocation for Wireless Virtualized Heterogeneous Network with Hybrid Energy Supply
2022
In this work, two novel virtual user association and resource allocation algorithms are introduced for a wireless virtualized heterogeneous network with hybrid energy supply. In the considered system, macro base stations (MBSs) are supplied by the grid power and small base stations (SBSs) have the energy harvesting capability in addition to the grid power supplement. Multiple infrastructure providers (InPs) own the physical resources, i.e., BSs and radio resources. The Mobile Virtual Network Operators (MVNOs) are able to rent these resources from the InPs and operate the virtualized resources for providing services to different users. In particular, aiming to maximize the overall utility …
A comparison between a two feedback control loop and a reinforcement learning algorithm for compliant low-cost series elastic actuators
2020
Highly-compliant elastic actuators have become progressively prominent over the last years for a variety of robotic applications. With remarkable shock tolerance, elastic actuators are appropriate for robots operating in unstructured environments. In accordance with this trend, a novel elastic actuator was recently designed by our research group for Serpens, a low-cost, open-source and highly-compliant multi-purpose modular snake robot. To control the newly designed elastic actuators of Serpens, a two-feedback loops position control algorithm was proposed. The inner controller loop is implemented as a model reference adaptive controller (MRAC), while the outer control loop adopts a fuzzy pr…
Evolution and Learning: Evolving Sensors in a Simple MDP Environment
2003
Natural intelligence and autonomous agents face difficulties when acting in information-dense environments. Assailed by a multitude of stimuli they have to make sense of the inflow of information, filtering and processing what is necessary, but discarding that which is unimportant. This paper aims at investigating the interactions between evolution of the sensorial channel extracting the information from the environment and the simultaneous individual adaptation of agent-control. Our particular goal is to study the influence of learning on the evolution of sensors, with learning duration being the tunable parameter. A genetic algorithm governs the evolution of sensors appropriate for the a…
Reinforcement learning approach to nonequilibrium quantum thermodynamics
2021
We use a reinforcement learning approach to reduce entropy production in a closed quantum system brought out of equilibrium. Our strategy makes use of an external control Hamiltonian and a policy gradient technique. Our approach bears no dependence on the quantitative tool chosen to characterize the degree of thermodynamic irreversibility induced by the dynamical process being considered, requires little knowledge of the dynamics itself and does not need the tracking of the quantum state of the system during the evolution, thus embodying an experimentally non-demanding approach to the control of non-equilibrium quantum thermodynamics. We successfully apply our methods to the case of single- …
Learning competitive pricing strategies by multi-agent reinforcement learning
2003
In electronic marketplaces automated and dynamic pricing is becoming increasingly popular. Agents that perform this task can improve themselves by learning from past observations, possibly using reinforcement learning techniques. Co-learning of several adaptive agents against each other may lead to unforeseen results and increasingly dynamic behavior of the market. In this article we shed some light on price developments arising from a simple price adaptation strategy. Furthermore, we examine several adaptive pricing strategies and their learning behavior in a co-learning scenario with different levels of competition. Q-learning manages to learn best-reply strategies well, but is e…
Validation of a Reinforcement Learning Policy for Dosage Optimization of Erythropoietin
2007
This paper deals with the validation of a Reinforcement Learning (RL) policy for dosage optimization of Erythropoietin (EPO). This policy was obtained using data from patients in a haemodialysis program during the year 2005. The goal of this policy was to maintain patients' Haemoglobin (Hb) level between 11.5 g/dl and 12.5 g/dl. An individual management was needed, as each patient usually presents a different response to the treatment. RL provides an attractive and satisfactory solution, showing that a policy based on RL would be much more successful in achieving the goal of maintaining patients within the desired target of Hb than the policy followed by the hospital so far. In this work, t…
An AI Walk from Pharmacokinetics to Marketing
2009
This work is intended to provide a review of real-life practical applications of Artificial Intelligence (AI) methods. We focus on the use of Machine Learning (ML) methods applied to real problems rather than synthetic problems with standard and controlled environments. In particular, we will describe the following problems in the next sections: • Optimization of Erythropoietin (EPO) dosages in anaemic patients undergoing Chronic Renal Failure (CRF). • Optimization of a recommender system for citizen web portal users. • Optimization of a marketing campaign. The choice of these problems is due to their relevance and their heterogeneity. This heterogeneity shows the capabilities and versatility …
Weeds sampling for map reconstruction: a Markov random field approach
2012
In the past 15 years, there has been a growing interest in the study of the spatial repartition of weeds in crops, mainly because this is a prerequisite to herbicides use reduction. There has been a large variety of statistical methods developed for this problem ([5], [7], [10]). However, one common point of all of these methods is that they are based on in situ collection of data about weeds spatial repartition. A crucial problem is then to choose where, in the field, data should be collected. Since exhaustive sampling of a field is too costly, a lot of attention has been paid to the development of spatial sampling methods ([12], [4], [6], [9]). Classical spatial stochastic model of weeds cou…
MARL-Ped: A multi-agent reinforcement learning based framework to simulate pedestrian groups
2014
Pedestrian simulation is complex because there are different levels of behavior modeling. At the lowest level, local interactions between agents occur; at the middle level, strategic and tactical behaviors appear like overtakings or route choices; and at the highest level path-planning is necessary. The agent-based pedestrian simulators either focus on a specific level (mainly in the lower one) or define strategies like the layered architectures to independently manage the different behavioral levels. In our Multi-Agent Reinforcement-Learning-based Pedestrian simulation framework (MARL-Ped) the situation is addressed as a whole. Each embodied agent uses a model-free Reinforcement L…